Using External Resources and Joint Learning for Bigram Weighting in ILP-Based Multi-Document Summarization

نویسندگان

Chen Li

Yang Liu

Lin Zhao

چکیده

Some state-of-the-art summarization systems use integer linear programming (ILP) based methods that aim to maximize the important concepts covered in the summary. These concepts are often obtained by selecting bigrams from the documents. In this paper, we improve such bigram based ILP summarization methods from different aspects. First we use syntactic information to select more important bigrams. Second, to estimate the importance of the bigrams, in addition to the internal features based on the test documents (e.g., document frequency, bigram positions), we propose to extract features by leveraging multiple external resources (such as word embedding from additional corpus, Wikipedia, Dbpedia, WordNet, SentiWordNet). The bigram weights are then trained discriminatively in a joint learning model that predicts the bigram weights and selects the summary sentences in the ILP framework at the same time. We demonstrate that our system consistently outperforms the prior ILP method on different TAC data sets, and performs competitively compared to other previously reported best results. We also conducted various analyses to show the contributions of different components.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Joint Optimization of User-desired Content in Multi-document Summaries by Learning from User Feedback

In this paper, we propose an extractive multi-document summarization (MDS) system using joint optimization and active learning for content selection grounded in user feedback. Our method interactively obtains user feedback to gradually improve the results of a state-of-the-art integer linear programming (ILP) framework for MDS. Our methods complement fully automatic methods in producing highqua...

متن کامل

Using Supervised Bigram-based ILP for Extractive Summarization

In this paper, we propose a bigram based supervised method for extractive document summarization in the integer linear programming (ILP) framework. For each bigram, a regression model is used to estimate its frequency in the reference summary. The regression model uses a variety of indicative features and is trained discriminatively to minimize the distance between the estimated and the ground ...

متن کامل

Fear the REAPER: A System for Automatic Multi-Document Summarization with Reinforcement Learning

This paper explores alternate algorithms, reward functions and feature sets for performing multi-document summarization using reinforcement learning with a high focus on reproducibility. We show that ROUGE results can be improved using a unigram and bigram similarity metric when training a learner to select sentences for summarization. Learners are trained to summarize document clusters based o...

متن کامل

Generating Summaries Using Sentence Compression and Statistical Measures

In this paper, we propose a compression based multi-document summarization technique by incorporating word bigram probability and word co-occurrence measure. First we implemented a graph based technique to achieve sentence compression and information fusion. In the second step, we use hand-crafted rule based syntactic constraint to prune our compressed sentences. Finally we use probabilistic me...

متن کامل

Multi-Document Abstractive Summarization Using ILP Based Multi-Sentence Compression

Abstractive summarization is an ideal form of summarization since it can synthesize information from multiple documents to create concise informative summaries. In this work, we aim at developing an abstractive summarizer. First, our proposed approach identifies the most important document in the multi-document set. The sentences in the most important document are aligned to sentences in other ...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2015

Using External Resources and Joint Learning for Bigram Weighting in ILP-Based Multi-Document Summarization

نویسندگان

چکیده

منابع مشابه

Joint Optimization of User-desired Content in Multi-document Summaries by Learning from User Feedback

Using Supervised Bigram-based ILP for Extractive Summarization

Fear the REAPER: A System for Automatic Multi-Document Summarization with Reinforcement Learning

Generating Summaries Using Sentence Compression and Statistical Measures

Multi-Document Abstractive Summarization Using ILP Based Multi-Sentence Compression

عنوان ژورنال:

اشتراک گذاری